What Is Better: GMM of Two Gaussians or Two Clusters with One Gaussian? (IDIAP-RR 02-56)

Author

  • Itshak Lapidot
Abstract

In this report, we provide a theoretical discussion of cluster analysis for temporal data: does the data come from one source or from two, and is it better to split the data into two clusters or to leave it as one? We analyse only the simplest case, in which the data comes from two symmetric Gaussian probability density functions (pdfs), i.e., with the same variance, the same absolute value of the mean, and the same prior probability per Gaussian. The data consists of segments with an a priori known segment length. We show that if the data belongs to two different Gaussian models, the likelihood of two clusters is always greater than or equal to that of a GMM with two Gaussians, for any mean, variance, and segment length. If the data belongs to the GMM, the likelihood of two clusters may be either higher or lower than that of the GMM.

Key Terms: clustering, expectation maximization, Gaussian mixture model, temporal data clustering.

The author wants to thank the Swiss Federal Office for Education and Science (OFES), in the framework of the EC/OFES "MultiModal Meeting Manager (M4)" project, and the Swiss National Science Foundation, through the National Center of Competence in Research (NCCR) on "Interactive Multimodal Information Management (IM2)", for supporting this work.
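The comparison described in the abstract can be explored numerically. The following is a minimal sketch, not the report's derivation: it assumes the true parameters are known, scores the GMM per sample with equal priors, and scores the two-cluster model by assigning each whole segment to whichever Gaussian gives it the higher likelihood (with no prior term). These scoring rules, and all function names (`gmm_loglik`, `two_cluster_loglik`, `simulate`), are illustrative assumptions; the report's exact definitions may differ.

```python
import numpy as np

def log_gauss(x, mean, var):
    """Elementwise log-density of a 1-D Gaussian N(mean, var)."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def gmm_loglik(segments, mu, var):
    """Log-likelihood under a two-Gaussian GMM (means +mu and -mu,
    equal priors), where every sample is scored independently."""
    lp_pos = log_gauss(segments, mu, var)
    lp_neg = log_gauss(segments, -mu, var)
    # log(0.5 * p_pos + 0.5 * p_neg), computed stably in the log domain
    return float(np.sum(np.logaddexp(lp_pos, lp_neg) + np.log(0.5)))

def two_cluster_loglik(segments, mu, var):
    """Log-likelihood under two single-Gaussian clusters (assumed rule):
    each segment is assigned as a whole to its better-scoring Gaussian."""
    seg_pos = log_gauss(segments, mu, var).sum(axis=1)
    seg_neg = log_gauss(segments, -mu, var).sum(axis=1)
    return float(np.sum(np.maximum(seg_pos, seg_neg)))

def simulate(n_segments, seg_len, mu, var, per_segment_source, rng):
    """Draw data of shape (n_segments, seg_len).
    per_segment_source=True: one source per segment (two-model case).
    per_segment_source=False: source redrawn per sample (GMM case)."""
    shape = (n_segments, 1) if per_segment_source else (n_segments, seg_len)
    signs = rng.choice([-1.0, 1.0], size=shape)
    noise = rng.normal(0.0, np.sqrt(var), size=(n_segments, seg_len))
    return signs * mu + noise

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for per_segment in (True, False):
        data = simulate(200, 10, mu=1.0, var=1.0,
                        per_segment_source=per_segment, rng=rng)
        print("per-segment source:", per_segment,
              "GMM:", gmm_loglik(data, 1.0, 1.0),
              "two clusters:", two_cluster_loglik(data, 1.0, 1.0))
```

Varying `mu`, `var`, and `seg_len` in the loop gives a rough feel for how the two scores relate in each data-generating case. A simulation of this kind only illustrates the comparison under the stated assumptions and does not replace the theoretical argument.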

Similar resources

An improved cluster model selection method for agglomerative hierarchical speaker clustering using incremental Gaussian mixture models

In this paper, we improve our previous cluster model selection method for agglomerative hierarchical speaker clustering (AHSC) based on incremental Gaussian mixture models (iGMMs). In the previous work, we measured the likelihood of all the data points in a given cluster for each mixture component of the GMM modeling the cluster. Then, we selected the N-best component Gaussians with the highes...

Comparison of Spectral Methods for Spoken Language Identification

Automatic spoken language identification is the task of identifying a language from the speech signal. Language identification systems can be divided into two categories: spectral-based methods and phonetic-based methods. In the former, short-time characteristics of the speech spectrum are extracted as a multi-dimensional vector. A statistical model of these features is then obtained for each language. The ...

Using Context-based Statistical Models to Promote the Quality of Voice Conversion Systems

This article examines methods for improving the performance of GMM-based voice conversion systems, taking the GMM method as the baseline approach. In current methods, because a single conversion function is used to convert all speech units, and because of the spectral smoothing that results from statistical averaging, we will observe quality ...

Skew Gaussian Mixture Models for Speaker Recognition

The current paper proposes skew Gaussian mixture models for speaker recognition and an associated algorithm for its training from experimental data. Speaker identification experiments were conducted, in which speakers were modeled using the familiar Gaussian mixture models (GMM), and the new skewGMM. Each model type was evaluated using two sets of feature vectors, the mel-frequency cepstral coe...

Data-Driven UBM Generation via Tied Gaussians for GMM-Supervector Based Accent Identification

This paper presents a new approach to exploit data-driven universal background model (UBM) generation using tied Gaussians for accent identification (AID). The motivation of the proposed algorithm is to potentially utilize broad phonetic-specific accent characteristics by Gaussian mixture model (GMM) and examine data-driven phonetically-inspired UBM creation for GMM-supervector based accent clas...

Publication date: 2002